Skip to content

Conversation

@federico-stacks
Copy link
Contributor

@federico-stacks federico-stacks commented Oct 28, 2025

Description (UPDATED)

This PR adds consensus test coverage for the ParseError and related ParseErrors enum variants (alias ParseErrorKind) in preparation for the upcoming refactor activities.

What's done

  • added test coverage for ParseErrors variants for the scenarios that are functionally feaseable.
  • added test documentation to point to the related variant being tested
  • added a report that for each variant sum-up "in-code" if it has been put under test or not with related reason (see variant_coverage_report function trick).
  • added a bunch of unit tests related to build_ast function as support for writing the consensus ones. Also added a cost tracker facility that allow to use the LimitedCostTracker and real cost functions.
  • Prevented test process from exploding (due to console output) when converting to string the IllegalASCIIString(String) variant (when the string field is bigger then 1MB), ellipsing the string (only for test). This is caused by this info log statement that print the clarity error as debug:
    info!(
    "Runtime error in contract analysis for {contract_id}: {other_error:?}";
    "txid" => %tx.txid(),
    );
  • Fixed CircularReference(Vec<String>) making the string list deterministic in terms of ordering (otherwise test result would flip/flop due to circular reference random ordering)
  • improved the Test Harness macros contract_deploy_consensus_test! and contract_call_consensus_test! so that now it is possible to encapsulate them in a "native" [test] function. By this we could now add proper rust documentation for test cases and having the IDE browsing support for the tests (e.g. outliner view, etc...)

Heads-up

> Stack Depth
A note about how I succeeded to test the VaryExpressionStackDepthTooDeep. It was only possible because of a "gap" on how the stack depth limit is managed while parsing:

  • in Parser::parse_node(..) with MAX_NESTING_DEPTH = AST_CALL_STACK_DEPTH_BUFFER + MAX_CALL_STACK_DEPTH + 1 ( total = 70) it is managed the nesting depth (with different rules for list and tuple) and if it pass,
  • in VaryStackDepthChecker::check_vary(..) a second check is done with depth to be < AST_CALL_STACK_DEPTH_BUFFER + MAX_CALL_STACK_DEPTH (total 69), applying same rule for list and tuple.

Basically with this difference of 1 (70 vs 69), I succeeded to make the test case pass the first step and fail in the second step.

Just to make sure I understand: we already know that 64 is the stack limit and that we allow a bit of extra “space,” but I’d like to better understand why we’re using different stack limits for the two checks?

> Odd ParseErrors Conversion
I notice some cases where a ParseErrors (alias ParseErrorKind) is obtained by converting from other error families.
In this example, Value::buff_from(bytes) produce an InterpreterError (alias VmExecutionError), internally due to a possible CheckErrors (alias CheckErrorKind), which is then remapped to a ParseErrors (alias ParseErrorKind) with the match expression

Token::Bytes(data) => {
let mut expr = match hex_bytes(data) {
Ok(bytes) => match Value::buff_from(bytes) {
Ok(value) => PreSymbolicExpression::atom_value(value),
_ => {
self.add_diagnostic(
ParseErrors::InvalidBuffer,
token.span.clone(),
)?;
PreSymbolicExpression::placeholder(token.token.reproduce())
}
},
Err(_) => {
self.add_diagnostic(
ParseErrors::InvalidBuffer,
token.span.clone(),
)?;
PreSymbolicExpression::placeholder(token.token.reproduce())

Maybe clarity types (in clarity-types/src/types/mod.rs) should have their own error layer that could be then converted properly by the user.

> ParseErrors::NameAlreadyUsed rename?
This error variant seems only used in case of Trait name. So it maybe renamed accordly (like TraitNameAlreadyUsed)

DefineFunctions::Trait => {
if args.len() != 2 {
return Err(ParseErrors::DefineTraitBadSignature.into());
}
match (&args[0].pre_expr, &args[1].pre_expr) {
(Atom(trait_name), List(trait_definition)) => {
// Check for collisions
if contract_ast.referenced_traits.contains_key(trait_name) {
return Err(
ParseErrors::NameAlreadyUsed(trait_name.to_string()).into()
);

> Duplicated MAX_STRING_LEN const
This constant seems to be duplicated in clarity and clarity-types with different types, but same value:

  • clarity::vm::ast::parser::v2::MAX_STRING_LEN: usize = 128;
  • clarity-types::representations: MAX_STRING_LEN u8 = 128
    • then also exposed by clarity crate as clarity::vm::representations::MAX_STRING_LEN

Could we merge the two? Maybe the usize one to u8 considering that the second is exposed by the clarity-types?

Possible Follow-ups

  • Adding more unit-tests for build_ast function. I noticed that we miss unit test coverage for this function (a lot of parse error variants don't have related test cases), so could be a valuable tasks.
  • About the "string ellipsing" on IllegalASCIIString(String) , a better approach would be to do it on the ParseErrors enum side, but that would require to implement a custom Debug trait implementation for the enum. To avoid to implement the debug format for each variant, it would be possible to use some crate like derive-debug that allows to override the debug format selectively at variant level. If we are interested in this, then it could be a follow-up.
  • Refactoring activity around the ParseErrors variants to clarify better the code intent. (unreachable variants, separating parse error from cost error, remove if possible the From error traslation, etc...). NOTE: For sure before starting would be better to have "aac plowing" phase completed (with the enum renaming), to avoid a conflict hell

Draft Description (OLD)

This PR adds consensus test coverage for the ParseError enum in preparation for the upcoming refactor of that enum. The PR is currently a draft to gather early feedback on test organization and macro usage.

The PR is currently a draft to gather early feedback about test organization and macro usage::

  1. Although this is not to be intended as the final structure, I’ve started isolating ParseError related tests in a dedicate module parse_tests.rs
  2. To support this, I've exported the contract_deploy_consensus_test! and contract_call_consensus_test! macros opting for:
    • scoped export, to prevent visibilitty at the crate root level
    • impoort consistency, so macros can be used without additional imports (i.e., resolving internal macro dependencies using absolute namespaces).
  3. Futhermore I've added clarity_types crate as test dependency for documentation purposes (doc referencing Parse error variants). This is just a preliminary attempt, and I open to remove it if it's not beneficial (I'm also trying understanding this)
    • Currently, it doesn’t work properly with the macro, causing IDE warnings and broken hyperlinks (would function correctly with a standard test function)

    • I also noticed that with the macro we loose the Outline view (is empty)

      So, apart the doc thing, I'm wondering if in general would be convenient having "native" test functions while keeping the macro just for configuring the test body. Something like this:

    #[test]
    fn test_parse_error_lexer() {
        contract_deploy_consensus_test!(
           parse_error__import_trait_bad_signature,
           contract_name: "my-contract",
           contract_code: &{"(use-trait)"},
       );
    }

Applicable issues

Additional info (benefits, drawbacks, caveats)

Checklist

  • Test coverage for new or modified code paths
  • Changelog is updated
  • Required documentation changes (e.g., docs/rpc/openapi.yaml and rpc-endpoints.md for v2 endpoints, event-dispatcher.md for new events)
  • New clarity functions have corresponding PR in clarity-benchmarking repo

@federico-stacks federico-stacks self-assigned this Oct 28, 2025
@federico-stacks federico-stacks added aac Avoiding Accidental Consensus aac-testing Avoiding Accidental Consensus Testing Specific Task labels Oct 28, 2025
@federico-stacks federico-stacks linked an issue Oct 28, 2025 that may be closed by this pull request

/// ParserError: [`ParseErrors::Lexer`]
/// Caused by: unknown symbol
/// Outcome: block accepted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your comments. Clear, concise, easy to follow/reason about.

@jacinta-stacks
Copy link
Contributor

I actually think it might be slightly easier to reason about the test name/be able to find tests if we did something like

#[test]
fn test_parse_error_lexer() {
    contract_deploy_consensus_test!(
       function_name!(),
       contract_name: "my-contract",
       contract_code: &{"(use-trait)"},
   );
}

since when I initially looked at contract_deploy_consensus_test! macro it took me a bit to realize it was a #[test]. But I am not strongly one way or the other. Whichever makes it easier for you to reuse these functions in your test, go for it. If you need help fixing your errors, let me know. I will give it a look.

@federico-stacks
Copy link
Contributor Author

I actually think it might be slightly easier to reason about the test name/be able to find tests if we did something like

#[test]
fn test_parse_error_lexer() {
    contract_deploy_consensus_test!(
       function_name!(),
       contract_name: "my-contract",
       contract_code: &{"(use-trait)"},
   );
}

since when I initially looked at contract_deploy_consensus_test! macro it took me a bit to realize it was a #[test]. But I am not strongly one way or the other. Whichever makes it easier for you to reuse these functions in your test, go for it. If you need help fixing your errors, let me know. I will give it a look.

Considering we have a macro, we should be able to use function_name!() internally.
Otherwise, if we also expose the function_name() as in your example, we could even replace the macro with a function.

@federico-stacks
Copy link
Contributor Author

Based on the feedback, I've updated consensus macros as proposed in the PR description.
It seems to work nicely (with documentation, but also I added a should_panic test for unreachable error variant)

By the way, adding more tests exposed a new issue related to the vm_error field, basically this test:

#[test]
fn test_circular_reference() {
    contract_deploy_consensus_test!(
        contract_name: "my-contract",
        contract_code: &{"
            (define-constant my-a my-b)
            (define-constant my-b my-a)
        "},
    );
}

produce a non-deterministic vm_error description. Here the possible output:

  • detected interdependent functions (my-a, my-b)
  • detected interdependent functions (my-b, my-a)

where the element listed in the paranthesis can have a different order and the flip/flop between them at each run.

@codecov
Copy link

codecov bot commented Oct 30, 2025

Codecov Report

❌ Patch coverage is 87.39496% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.38%. Comparing base (f8d02cd) to head (298959a).

Files with missing lines Patch % Lines
stackslib/src/chainstate/tests/parse_tests.rs 85.98% 15 Missing ⚠️

❌ Your project check has failed because the head coverage (74.38%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6631      +/-   ##
===========================================
+ Coverage    64.85%   74.38%   +9.52%     
===========================================
  Files          574      575       +1     
  Lines       355039   355145     +106     
===========================================
+ Hits        230278   264188   +33910     
+ Misses      124761    90957   -33804     
Files with missing lines Coverage Δ
stackslib/src/chainstate/tests/consensus.rs 95.00% <100.00%> (+9.78%) ⬆️
stackslib/src/chainstate/tests/mod.rs 70.08% <ø> (ø)
stackslib/src/chainstate/tests/parse_tests.rs 85.98% <85.98%> (ø)

... and 373 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f8d02cd...298959a. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

@francesco-stacks francesco-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good changes! Definitely fine with the named tests.

PS:I am sure you already noticed, but some unit tests are currently failing

@federico-stacks
Copy link
Contributor Author

Good changes! Definitely fine with the named tests.

PS:I am sure you already noticed, but some unit tests are currently failing

Yeah! This is due the vm_error for the test_circular_reference being not deterministic as mentioned in this comment: #6631 (comment)

@federico-stacks federico-stacks force-pushed the chore/aac-parse-error-test branch from 27b1d56 to a81db5d Compare October 31, 2025 12:10
@federico-stacks
Copy link
Contributor Author

By the way, adding more tests exposed a new issue related to the vm_error field, basically this test:

#[test]
fn test_circular_reference() {
    contract_deploy_consensus_test!(
        contract_name: "my-contract",
        contract_code: &{"
            (define-constant my-a my-b)
            (define-constant my-b my-a)
        "},
    );
}

produce a non-deterministic vm_error description. Here the possible output:

  • detected interdependent functions (my-a, my-b)
  • detected interdependent functions (my-b, my-a)

where the element listed in the paranthesis can have a different order and the flip/flop between them at each run.

I opted to manage this issue, forcing the CircularRefercence being emitted with a determinitic list.
see commit: a81db5d

@federico-stacks federico-stacks force-pushed the chore/aac-parse-error-test branch from e62cd5c to d9c6948 Compare November 6, 2025 07:29
@federico-stacks federico-stacks force-pushed the chore/aac-parse-error-test branch from 1be51dc to 8c0a7fc Compare November 7, 2025 18:03
@federico-stacks federico-stacks marked this pull request as ready for review November 10, 2025 08:24
@federico-stacks federico-stacks requested review from a team as code owners November 10, 2025 08:24
@federico-stacks
Copy link
Contributor Author

For @jacinta-stacks and @francesco-stacks, who reviewed the PR while it was in draft: I’ve updated the PR description to reflect the final state of the work.

Copy link
Contributor

@jacinta-stacks jacinta-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really clean. I like this a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

aac Avoiding Accidental Consensus aac-testing Avoiding Accidental Consensus Testing Specific Task

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AAC Testing: Add test coverage for ParseError enum

3 participants